Conversation
paddle/operators/reduce_op.cc
Outdated
I think writing min, max, etc. as an attr and registering a single kernel would make the .cc file shorter and cleaner.
ReduceMinOpMaker, ReduceMaxOpMaker, and the others are currently almost entirely duplicated. Four separate kernels are written now; merging them into one would save nearly 3/4 of the code.
Reply: Sorry for the late reply. This follows TensorFlow's approach: https://github.com/tensorflow/tensorflow/blob/216dcbf1e08c02d87774120ebd5b251c5c30c56c/tensorflow/core/kernels/reduction_ops_sum.cc#L26 , and PyTorch also has similar reduce operations: https://github.com/pytorch/pytorch/blob/master/torch/autograd/_functions/reduce.py . Splitting into multiple OPs seems semantically clearer, and putting min/max into an attr would make the kernel code rather long; other reduce operations may also be added later. The functor approach has a potential benefit as well: it may be easier to reuse, whereas the current kernels are not easy to reuse. I'm not sure which way is better. Thanks for the comments and suggestions~
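The functor idea under discussion can be sketched in a few lines. This is a hypothetical Python/NumPy analogue, not Paddle code: one generic kernel body is shared, and each reduce op contributes only a small functor.

```python
import numpy as np

def reduce_kernel(x, dim, functor):
    # The shared kernel body: shape handling would live here in the real
    # operator; the functor carries only the op-specific reduction.
    return functor(x, dim)

# One small functor per op -- analogous to registering several OPs that
# reuse a single templated kernel in C++.
sum_functor = lambda x, dim: np.sum(x, axis=dim)
max_functor = lambda x, dim: np.max(x, axis=dim)

x = np.array([[1.0, 2.0], [3.0, 4.0]])
print(reduce_kernel(x, 0, sum_functor))  # [4. 6.]
print(reduce_kernel(x, 1, max_functor))  # [2. 4.]
```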
paddle/operators/reduce_op.h
Outdated
This corresponds to the several ranks of EigenTensor; since EigenTensor takes the rank as a template parameter, the supported ranks are written out explicitly here.
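The explicit rank enumeration can be pictured roughly like this; a hypothetical Python stand-in for the C++ template dispatch (names are illustrative, not Paddle's API):

```python
import numpy as np

# In the C++ kernel each rank maps to a distinct EigenTensor<T, rank>
# template instantiation, so the ranks must be enumerated explicitly.
SUPPORTED_RANKS = (1, 2, 3, 4, 5, 6)

def reduce_sum(x, dim):
    rank = x.ndim
    if rank not in SUPPORTED_RANKS:
        raise ValueError("rank %d not supported" % rank)
    # One NumPy call stands in for all the per-rank instantiations here.
    return np.sum(x, axis=dim)

x = np.ones((2, 3, 4))
print(reduce_sum(x, 1).shape)  # (2, 4)
```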
paddle/operators/reduce_op.h
Outdated
If min/max are written as an attr, no functor is needed here either; a switch would do, and half of the code in lines 29-86 could be saved.
Reply: The functor approach has a potential benefit: it may be relatively easy to reuse, whereas the current kernel does not seem easy to reuse.
paddle/operators/reduce_op.cu
Outdated
In the current implementation, each loop iteration launches one Eigen GPU kernel, which will be rather slow. Consider writing a dedicated kernel.
Reply: I don't quite understand what this means; could you explain a bit more concretely~
Define a parent class, then have TestMeanOp(xxxx) and the others inherit from it; functions like test_check_out can then all be reused.
Reply: Good suggestion, thanks. Here reduce_max and reduce_min skip the gradient check, and the test content of the several OPs is not very uniform, so I'd prefer to leave this unchanged for now.
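The base-class pattern suggested above can be sketched roughly as follows; this is a hypothetical unittest sketch, not Paddle's actual OpTest API, with NumPy reductions standing in for running the real operator:

```python
import unittest
import numpy as np

class ReduceTestBase(object):
    # Subclasses set only the reduction under test and the axis;
    # the shared checks live here. reduce_fn is a NumPy stand-in
    # for invoking the actual operator.
    reduce_fn = None
    dim = 0

    def setUp(self):
        self.x = np.arange(12, dtype="float32").reshape(3, 4)

    def test_check_output(self):
        out = self.reduce_fn(self.x, axis=self.dim)
        self.assertEqual(out.shape, (4,))

class TestMeanOp(ReduceTestBase, unittest.TestCase):
    reduce_fn = staticmethod(np.mean)

class TestMaxOp(ReduceTestBase, unittest.TestCase):
    reduce_fn = staticmethod(np.max)
```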
paddle/operators/reduce_op.cc
Outdated
The several OpMakers for Sum, Mean, Max, and Min are basically identical. This code could be implemented with one base class, with subclasses handling only the parts that differ.
See #4139 ElementwiseOpmaker for reference.
paddle/operators/reduce_op.cc
Outdated
Our Tensor supports at most 9 dimensions. Where does this 6 come from?
Reply: This currently follows crop_op: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/crop_op.h ; it may indeed need to be unified.
ebcf710 to 8b3bf28
paddle/operators/reduce_op.cc
Outdated
ReduceMean -> ReduceSum
paddle/operators/reduce_op.cc
Outdated
namespace operators {

using framework::Tensor;
using framework::LoDTensor;
paddle/operators/reduce_op.cc
Outdated
  dims_vector.erase(dims_vector.begin() + dim);
}
auto out_dims = framework::make_ddim(dims_vector);
ctx.Output<framework::LoDTensor>("Out")->Resize(out_dims);
framework::LoDTensor -> framework::Tensor
paddle/operators/reduce_op.cc
Outdated
AddOutput("Out", "(Tensor) The result tensor.");
AddAttr<int>("dim",
             "(int, default 0) The dimension to reduce. "
             "Must be in the range [-rank(input), rank(input))")
Add more comments for the dim, or add some examples in the Doc.
A negative dim counts from the last dimension, right? Please add a comment.
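One way to document the negative-dim convention: a negative dim counts from the end, so dim = -1 is the last dimension, i.e. negative values are shifted by the rank. A small illustrative sketch (not the operator's actual code):

```python
def normalize_dim(dim, rank):
    # dim must lie in [-rank, rank); negative values count from the end,
    # so dim = -1 refers to the last dimension.
    if not -rank <= dim < rank:
        raise ValueError("dim %d out of range for rank %d" % (dim, rank))
    return dim if dim >= 0 else dim + rank

print(normalize_dim(-1, 4))  # 3 (last dimension)
print(normalize_dim(2, 4))   # 2 (unchanged)
```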
auto equals = x == y.broadcast(dim);
auto ones = dx.constant(1);
auto zeros = dx.constant(0);
dx.device(place) = dy.broadcast(dim) * equals.select(ones, zeros);
If there are multiple max/min values, this implementation propagates the full grad of max to every max value during backward, rather than setting the gradient of some of the max values to 0. Also, after discussing with @guoshengCS: for multiple max/min values, TF averages the gradient over them: https://github.com/tensorflow/tensorflow/blob/37f7ad75bbd2ca140d1092342eb3590d54193bc8/tensorflow/cc/gradients/math_grad.cc#L711
Please add a comment describing our handling here~
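The tie behavior under discussion can be illustrated with a small NumPy sketch: with the equals.select(ones, zeros) formulation, every tied maximum receives the full incoming gradient, whereas the TF-style variant divides it among the ties (all names below are illustrative):

```python
import numpy as np

x  = np.array([1.0, 3.0, 3.0])   # two tied maxima
y  = np.max(x)                   # forward reduce_max -> 3.0
dy = 1.0                         # incoming gradient for y

equals = (x == y)                # the broadcast comparison in the kernel

# equals.select(ones, zeros) semantics: every tied max gets the full dy.
dx_full = dy * np.where(equals, 1.0, 0.0)

# TF-style alternative: average the gradient over the tied maxima.
dx_avg = dx_full / equals.sum()

print(dx_full)  # [0. 1. 1.]
print(dx_avg)   # [0.  0.5 0.5]
```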
paddle/operators/reduce_op.h
Outdated
auto* input2 = context.Input<Tensor>(framework::GradVarName("Out"));
auto* output = context.Output<Tensor>(framework::GradVarName("X"));

if (output != nullptr) {
If backward has only one output, the internal implementation need not handle the output == nullptr case; https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/backward.cc#L67 returns a NOP for that case.
paddle/operators/reduce_op.h
Outdated
private:
template <size_t D>
void ReduceCompute(const framework::ExecutionContext& context) const {
To distinguish it from ReduceCompute above, should this be named ReduceGradCompute?
paddle/operators/reduce_op.h
Outdated
// For EigenTensor unsupported reduce
template <typename T, typename Functor>
class ReduceGradEigenFreeKernel : public framework::OpKernel {
7651339 to 477a6a0
Resolves #4060